-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/stanza: add backpropagation of number of processed entries #16452
pkg/stanza: add backpropagation of number of processed entries #16452
Conversation
753168a
to
c57b55e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me in general.
The unit tests are great, but I think given the overall purpose of this change, we should also have a few tests cases that validate this the return values are propagated back through an entire pipeline.
@djaglowski Do you have an idea how to do it technically? Not really sure how the test scenario should look like. Do you want me to create a receiver which consists of several operators, and then validate return value of the last receiver? |
You can build and start a pipeline at the |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
77e57c0
to
2861aa1
Compare
Foresight Summary
View More Details⭕ build-and-test-windows workflow has finished in 4 seconds (45 minutes 13 seconds less than
|
Job | Failed Steps | Tests | |
---|---|---|---|
windows-unittest-matrix | - 🔗 | N/A | See Details |
windows-unittest | - 🔗 | N/A | See Details |
✅ tracegen workflow has finished in 1 minute 5 seconds (2 minutes 54 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
build-dev | - 🔗 | N/A | See Details |
publish-latest | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
✅ check-links workflow has finished in 1 minute 37 seconds (1 minute 59 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
changed files | - 🔗 | N/A | See Details |
check-links | - 🔗 | N/A | See Details |
✅ changelog workflow has finished in 2 minutes 12 seconds (5 minutes 51 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
changelog | - 🔗 | N/A | See Details |
✅ prometheus-compliance-tests workflow has finished in 3 minutes 36 seconds (6 minutes 38 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
prometheus-compliance-tests | - 🔗 | ✅ 21 ❌ 0 ⏭ 0 🔗 | See Details |
✅ build-and-test workflow has finished in 35 minutes 41 seconds (22 minutes 39 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
unittest-matrix (1.19, internal) | - 🔗 | ✅ 597 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, internal) | - 🔗 | ✅ 597 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, processor) | - 🔗 | ✅ 1476 ❌ 0 ⏭ 0 🔗 | See Details |
correctness-metrics | - 🔗 | ✅ 2 ❌ 0 ⏭ 0 🔗 | See Details |
correctness-traces | - 🔗 | ✅ 17 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, processor) | - 🔗 | ✅ 1476 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, extension) | - 🔗 | ✅ 528 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, extension) | - 🔗 | ✅ 528 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, receiver-0) | - 🔗 | ✅ 2563 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, receiver-0) | - 🔗 | ✅ 2563 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, exporter) | - 🔗 | ✅ 2450 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, exporter) | - 🔗 | ✅ 2450 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, other) | - 🔗 | ✅ 4401 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, other) | - 🔗 | ✅ 4401 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.18, receiver-1) | - 🔗 | ✅ 1886 ❌ 0 ⏭ 0 🔗 | See Details |
unittest-matrix (1.19, receiver-1) | - 🔗 | ✅ 1886 ❌ 0 ⏭ 0 🔗 | See Details |
integration-tests | - 🔗 | ✅ 55 ❌ 0 ⏭ 0 🔗 | See Details |
setup-environment | - 🔗 | N/A | See Details |
check-collector-module-version | - 🔗 | N/A | See Details |
check-codeowners | - 🔗 | N/A | See Details |
lint-matrix (receiver-0) | - 🔗 | N/A | See Details |
lint-matrix (receiver-1) | - 🔗 | N/A | See Details |
lint-matrix (processor) | - 🔗 | N/A | See Details |
lint-matrix (exporter) | - 🔗 | N/A | See Details |
lint-matrix (extension) | - 🔗 | N/A | See Details |
lint-matrix (internal) | - 🔗 | N/A | See Details |
lint-matrix (other) | - 🔗 | N/A | See Details |
build-examples | - 🔗 | N/A | See Details |
checks | - 🔗 | N/A | See Details |
lint | - 🔗 | N/A | See Details |
unittest (1.19) | - 🔗 | N/A | See Details |
unittest (1.18) | - 🔗 | N/A | See Details |
cross-compile (darwin, amd64) | - 🔗 | N/A | See Details |
cross-compile (darwin, arm64) | - 🔗 | N/A | See Details |
cross-compile (linux, 386) | - 🔗 | N/A | See Details |
cross-compile (linux, amd64) | - 🔗 | N/A | See Details |
cross-compile (linux, arm) | - 🔗 | N/A | See Details |
cross-compile (linux, arm64) | - 🔗 | N/A | See Details |
cross-compile (linux, ppc64le) | - 🔗 | N/A | See Details |
cross-compile (windows, 386) | - 🔗 | N/A | See Details |
cross-compile (windows, amd64) | - 🔗 | N/A | See Details |
build-package (deb) | - 🔗 | N/A | See Details |
build-package (rpm) | - 🔗 | N/A | See Details |
windows-msi | - 🔗 | N/A | See Details |
publish-check | - 🔗 | N/A | See Details |
publish-dev | - 🔗 | N/A | See Details |
publish-stable | - 🔗 | N/A | See Details |
✅ load-tests workflow has finished in 8 minutes 21 seconds (9 minutes 11 seconds less than main
branch avg.) and finished at 29th Dec, 2022.
Job | Failed Steps | Tests | |
---|---|---|---|
loadtest (TestIdleMode) | - 🔗 | ✅ 1 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceAttributesProcessor) | - 🔗 | ✅ 3 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestMetric10kDPS|TestMetricsFromFile) | - 🔗 | ✅ 6 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) | - 🔗 | ✅ 8 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) | - 🔗 | ✅ 12 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) | - 🔗 | ✅ 10 ❌ 0 ⏭ 0 🔗 | See Details |
loadtest (TestBallastMemory|TestLog10kDPS) | - 🔗 | ✅ 19 ❌ 0 ⏭ 0 🔗 | See Details |
setup-environment | - 🔗 | N/A | See Details |
*You can configure Foresight comments in your organization settings page.
@djaglowski Is this change still wanted according to the #17079? |
My understanding is that the issue is resolved by #17079. The |
a4afe4b
to
70bc711
Compare
@djaglowski I rebased the PR. Backpropagation is added, but it's not used in logstransformprocessor for now. Looking at the #17079 change, I'm not sure if asynchronous processors are align with the pipeline architecture. In current state it will take all logs and claim that they are processed correctly: I can revert this to the desired state of my commit, but not sure what is the decision here. |
I may be wrong, but I don't think processors are necessarily expected to be synchronous. For example,
Can you elaborate on this? I'm not familiar with how this is communicated and what impact it has. |
There is an error backpropagation: Not really sure who and how handle the errors, but reminds me there is some issue with batchprocessor do not backpropagating the errors: When exporter cannot handle more data and has to drop them it returns error to Anyway, I think the PR is ready and making |
In order to justify the additional complexity, we need to have a clear design for how this would be used. It's no longer clear to me what these changes would accomplish. What will we do with the returned integer, specifically? |
Yeah, this is correct, and tracked in open-telemetry/opentelemetry-collector#4646. In general, if we have an asynchronous component, that component essentially acts as a buffer and needs to take care of its own retry logic. Being asynchronous, it can't return errors to the caller, but it can and should retry on errors it gets from the rest of the pipeline. Otherwise it just drops data without providing any feedback and stops conducting backpressure. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@djaglowski any input here? I don't feel this change makes sense in other scenarios (maybe some metric to point how many data has been dropped in the queue) |
Thanks for explaining this, it makes sense to me now why this processor would need special considerations for handling backpressure. Do we have the same problem in the |
It seems that errors are only logged out by the receiver. There is no try to handle recoverable errors or to inform client that data has been dropped |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@djaglowski Should I create separate issue for this? |
@sumo-drosiek, I think a separate issue makes sense. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description:
This is fix for #15378
Idea is to backpropagate information how many entries has been processed by pipeline
Link to tracking Issue:
#15378
Testing:
Manual
Updated unit tests
Documentation:
In code comments